A Customized Cross-Bar for Data-Shuffling in Domain-Specific SIMD Processors

Authors

  • Praveen Raghavan
  • Satyakiran Munaga
  • Estela Rey Ramos
  • Andy Lambrechts
  • Murali Jayapala
  • Francky Catthoor
  • Diederik Verkest
Abstract

Shuffle operations are among the most common operations in SIMD-based embedded system architectures. In this paper we study the families of shuffle operations that occur frequently in embedded applications running on SIMD architectures. These shuffle operations are used to drive the design of a custom shuffler for domain-specific SIMD processors. The energy efficiency of various crossbar-based custom shufflers is analyzed and compared with that of the widely used full crossbar. We show that by customizing the crossbar to implement the specific shuffle operations required in the target application domain, we can reduce the energy consumption of shuffle operations by up to 80%. We also illustrate the tradeoffs between flexibility and energy efficiency of custom shufflers and show that customization offers reasonable benefits without compromising the flexibility required for the target application domain.
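
As a rough illustration of the contrast the abstract draws, the following Python sketch models a full crossbar (any output lane can read any input lane) next to a customized shuffler restricted to a fixed pattern set. The 4-lane width, the pattern names, and the crosspoint count used as a cost proxy are all assumptions made for this example; they are not taken from the paper, and the count is not the paper's energy model.

```python
from typing import Dict, List, Sequence


def full_crossbar_shuffle(lanes: Sequence[int], select: Sequence[int]) -> List[int]:
    """Full crossbar: output lane i may read any input lane select[i]."""
    if len(select) != len(lanes):
        raise ValueError("need exactly one select index per output lane")
    if any(src < 0 or src >= len(lanes) for src in select):
        raise ValueError("select index out of range")
    return [lanes[src] for src in select]


# Hypothetical pattern set for a 4-lane datapath (illustrative only,
# not the shuffle families identified in the paper).
CUSTOM_PATTERNS: Dict[str, List[int]] = {
    "identity": [0, 1, 2, 3],
    "reverse": [3, 2, 1, 0],
    "rotate_left_1": [1, 2, 3, 0],
    "swap_pairs": [1, 0, 3, 2],
}


def custom_shuffle(lanes: Sequence[int], pattern: str) -> List[int]:
    """Customized shuffler: only the predefined patterns are supported."""
    if pattern not in CUSTOM_PATTERNS:
        raise ValueError(f"unsupported shuffle pattern: {pattern}")
    return full_crossbar_shuffle(lanes, CUSTOM_PATTERNS[pattern])


def crosspoints_needed(patterns: Dict[str, List[int]]) -> int:
    """Count the distinct (output lane, input lane) connections the patterns use.

    This is only a crude stand-in for hardware cost: a full N-lane crossbar
    has N * N crosspoints, while a customized one needs just those in use.
    """
    used = {
        (out, src)
        for select in patterns.values()
        for out, src in enumerate(select)
    }
    return len(used)


if __name__ == "__main__":
    data = [10, 20, 30, 40]
    print(full_crossbar_shuffle(data, [3, 3, 0, 1]))  # arbitrary routing
    print(custom_shuffle(data, "rotate_left_1"))      # restricted routing
    print(crosspoints_needed(CUSTOM_PATTERNS), "of", len(data) ** 2, "crosspoints")
```

The restricted shuffler rejects any pattern outside its set, which is the flexibility being traded away; in return, fewer input-to-output connections need to exist at all.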

Similar resources

A Domain-Specific Language and Compiler for Stencil Computations on Short-Vector SIMD and GPU Architectures

Stencil computations are an integral part of applications in a number of scientific computing domains, such as image processing and partial differential equations. We describe a domain-specific language for regular stencil computations that allows the computations to be specified concisely. We describe a multi-target compiler for this DSL that generates optimized code for multi-cor...

21 Processor Architectures For

In this chapter, we present contemporary VLSI processor architectures that support multimedia applications. We classified these processors into two groups: dedicated multimedia processors, which perform dedicated multimedia functions, such as MPEG encoding or decoding, and general-purpose processors that provide support for multimedia. Dedicated multimedia processors use either function-specifi...

Preliminary Work on a Mechanism for Testing a Customized Architecture

Hardware customization for scientific applications has shown a big potential for reducing power consumption and increasing performance. In particular, the automatic generation of ISA extensions for General-Purpose Processors (GPPs) to accelerate domain-specific applications is an active field of research. Those domain-specific customized processors are mostly evaluated in simulation environment...

Operating Systems (IT308) Second In-semester Exam

Option D. Single instruction, multiple data is a form of domain decomposition in which the same instruction is executed on different parts of the data. The different portions of the data are handled by the multiple processors available. Multiple instruction means that various operations are performed on the same data. The different tasks can be allocated to different processors in order to parallelise...

Permutation Networks as Application Specific Interconnection Networks for SIMD DSPs

Parallelization is one possibility to enhance the performance in processors. For single-instruction-multiple-data (SIMD) processors one requirement is the possibility to parallelize algorithms so that one algorithm is partitioned to the parallel processing units of the processor. This means that data has to be exchanged between the processing units. Hence, these separate units have to be connect...

Publication date: 2007